Asynchronous Stochastic Coordinate Descent: Parallelism and Convergence Properties

نویسندگان

  • Ji Liu
  • Stephen J. Wright
چکیده

We describe an asynchronous parallel stochastic proximal coordinate descent algorithm for minimizing a composite objective function, which consists of a smooth convex function plus a separable convex function. In contrast to previous analyses, our model of asynchronous computation accounts for the fact that components of the unknown vector may be written by some cores simultaneously with being read by others. Despite the complications arising from this possibility, the method achieves a linear convergence rate on functions that satisfy an optimal strong convexity property and a sublinear rate (1/k) on general convex functions. Near-linear speedup on a multicore system can be expected if the number of processors is O(n1/4). We describe results from implementation on ten cores of a multicore processor.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Asynchronous Accelerated Stochastic Gradient Descent

Stochastic gradient descent (SGD) is a widely used optimization algorithm in machine learning. In order to accelerate the convergence of SGD, a few advanced techniques have been developed in recent years, including variance reduction, stochastic coordinate sampling, and Nesterov’s acceleration method. Furthermore, in order to improve the training speed and/or leverage larger-scale training data...

متن کامل

Asynchronous Decentralized Parallel Stochastic Gradient Descent

Recent work shows that decentralized parallel stochastic gradient decent (D-PSGD) can outperform its centralized counterpart both theoretically and practically. While asynchronous parallelism is a powerful technology to improve the efficiency of parallelism in distributed machine learning platforms and has been widely used in many popular machine learning softwares and solvers based on centrali...

متن کامل

An Asynchronous Parallel Stochastic Coordinate Descent Algorithm

We describe an asynchronous parallel stochastic coordinate descent algorithm for minimizing smooth unconstrained or separably constrained functions. The method achieves a linear convergence rate on functions that satisfy an essential strong convexity property and a sublinear rate (1/K) on general convex functions. Near-linear speedup on a multicore system can be expected if the number of proces...

متن کامل

PASSCoDe: Parallel ASynchronous Stochastic dual Co-ordinate Descent

Stochastic Dual Coordinate Descent (DCD) is one of the most efficient ways to solve the family of `2-regularized empirical risk minimization problems, including linear SVM, logistic regression, and many others. The vanilla implementation of DCD is quite slow; however, by maintaining primal variables while updating dual variables, the time complexity of DCD can be significantly reduced. Such a s...

متن کامل

Asynchronous Doubly Stochastic Proximal Optimization with Variance Reduction

In the big data era, both of the sample size and dimension could be huge at the same time. Asynchronous parallel technology was recently proposed to handle the big data. Specifically, asynchronous stochastic (variance reduction) gradient descent algorithms were recently proposed to scale the sample size, and asynchronous stochastic coordinate descent algorithms were proposed to scale the dimens...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:
  • SIAM Journal on Optimization

دوره 25  شماره 

صفحات  -

تاریخ انتشار 2015